What's Worthy of Comment? Content and Comment Volume in Political Blogs

Authors

  • Tae Yano
  • Noah A. Smith
Abstract

In this paper we aim to model the relationship between the text of a political blog post and the comment volume (that is, the total amount of response) that a post will receive. We seek to accurately identify which posts will attract a high-volume response, and also to gain insight about the community of readers and their interests. We design and evaluate variations on a latent-variable topic model that links text to comment volume.

Introduction

What makes a blog post noteworthy? One measure of the popularity or breadth of interest of a blog post is the extent to which readers of the blog are inspired to leave comments on the post. In this paper, we study the relationship between the text contents of a blog post and the volume of response it will receive from blog readers. Modeling this relationship has the potential to reveal the interests of a blog's readership community to its authors, readers, advertisers, and scientists studying the blogosphere, but it may also be useful in improving technologies for blog search, recommendation, summarization, and so on.

There are many ways to define "popularity" in blogging. In this study, we focus exclusively on the aggregate volume of comments. Commenting is an important activity in the political blogosphere, giving a blog site the potential to become a discussion forum. For a given blog post, we treat comment volume as a target output variable, and use generative probabilistic models to learn from past data the relationship between a blog post's text contents and its comment volume. While many clues might be useful in predicting comment volume (e.g., the post's author, the time the post appears, the length of the post), here we focus solely on the text contents of the post.

We first describe the data and experimental framework, including a simple baseline. We then explore how latent-variable topic models can be used to make better predictions about comment volume.
These models reveal that part of the variation in comment volume can be explained by the topic of the blog post, and elucidate the relative degrees to which readers find each topic comment-worthy.

*The authors acknowledge research support from HP Labs and helpful comments from the reviewers and Jacob Eisenstein.

Copyright © 2010, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Predicting Comment Volume

Our goal is to predict some measure of the volume of comments on a new blog post. Volume might be measured as the number of words in the comment section, the number of comments, the number of distinct users who leave comments, or a variety of other ways. Any of these can be affected by uninteresting factors (the time of day the post appears, a side conversation, a surge in spammer activity), but these quantities are easily measured.

In research on blog data, comments are often ignored, and it is easy to see why: comments are very noisy, full of non-standard grammar and spelling, usually unedited, and often cryptic and uninformative, at least to those outside the blog's community. A few studies have focused on information in comments. Mishne and Glance (2006) showed the value of comments in characterizing the social repercussions of a post, including popularity and controversy. Their large-scale user study correlated popularity and comment activity. Yano et al. (2009) sought to predict which members of a blog's community would leave comments, and in some cases used the text contents of the comments themselves to discover topics related to both words and user comment behavior. This work is similar, but we seek to predict the aggregate behavior of the blog post's readers: given a new blog post, how much will the community comment on it?
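The three volume measures named in this section (words in the comment section, number of comments, and number of distinct commenting users) can be illustrated with a minimal Python sketch. The `Comment` record, its field names, the `comment_volume` helper, and the whitespace tokenization are assumptions made for illustration only; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    user: str  # hypothetical field: identifier of the commenter
    text: str  # hypothetical field: raw text of the comment


def comment_volume(comments):
    """Return three simple volume measures for one post's comment section."""
    return {
        # number of comments left on the post
        "n_comments": len(comments),
        # total words across all comments (naive whitespace tokenization)
        "n_words": sum(len(c.text.split()) for c in comments),
        # number of distinct users who commented
        "n_users": len({c.user for c in comments}),
    }


# Toy comment section for a single post (made-up data)
example = [
    Comment("alice", "I disagree with this post"),
    Comment("bob", "Same here"),
    Comment("alice", "To add one more point"),
]
volumes = comment_volume(example)
```

Any one of these counts could serve as the target variable; which one is appropriate depends on how much the noise sources mentioned above (spam, side conversations) distort it for a given blog.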

Similar resources

Modeling Polarizing Topics: When Do Different Political Communities Respond Differently to the Same News?

Political discourse in the United States is becoming increasingly polarized. This polarization frequently causes different communities to react very differently to the same news events. Political blogs as a form of social media provide a unique insight into this phenomenon. We present a multi-target, semi-supervised latent variable model, MCR-LDA, to model this process by analyzing political blog...


What pushes their buttons? Predicting comment polarity from the content of political blog posts

Political blogs as a form of social media allow for a uniquely interactive form of political discourse. This is especially evident in focused blogs with a strong ideological identity. We investigate techniques to identify topics within the context of the community, which when discussed in a blog post evoke a discernible positive or negative collective opinion from readers who respond to posts ...


Health Policy and Management: In Praise of Political Science; Comment on “On Health Policy and Management (HPAM): Mind the Theory-Policy Practice Gap”

Health systems have entered a third era embracing whole systems thinking and posing complex policy and management challenges. Understanding how such systems work and agreeing what needs to be put in place to enable them to undergo effective and sustainable change are more pressing issues than ever for policy-makers. The theory-policy-practice-gap and its four dimensions, as articulated by Chini...


Political Deliberation in the Blogosphere: The Case of the 2009 Portuguese Elections

In 2009, a unique Portuguese electoral cycle comprised European, local, and national elections. During the three-month non-stop campaign period, more than a hundred experienced bloggers, supporters of the three main political parties, created three non-party-sponsored blogs. These blogs were the focal point of the political blogosphere during that period and ceased their activities at the end o...


Political Impetus: Towards a Successful Agenda-Setting for Inclusive Health Policies in Low- and Middle-Income Countries; Comment on “Shaping the Health Policy Agenda: The Case of Safe Motherhood Policy in Vietnam”

Agenda-setting is a crucial step for inclusive health policies in low- and middle-income countries (LMICs). Inspired by the manuscript of Ha et al., this commentary paper argues that 'political impetus' is the key to the successful agenda-setting of health policies in LMICs, though other determinants may also play a role during the process. This Vietnamese case study presents a good example fo...

Publication date: 2010